Investigation of Property Valuation Models Based on Decision Tree Ensembles Built over Noised Data
Authors
Abstract
Ensemble machine learning methods incorporating bagging, random subspace, random forest, and rotation forest, with decision trees (pruned model trees) as base learners, were developed in the WEKA environment. The methods were applied to the real-world regression problem of predicting the prices of residential premises from historical data on sale/purchase transactions. The accuracy of the resulting ensembles was compared at several levels of noise injected into an attribute, into the output, and into both the attribute and the output. Ensembles built using rotation forest outperformed the other models, whereas the random subspace method produced the models most resistant to noisy data.
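The abstract does not say how noise was generated, so the following is only a minimal sketch of one possible scheme: a chosen fraction of values in a column is perturbed with Gaussian noise scaled by that column's standard deviation. The function names (`inject_noise`, `make_noisy_dataset`), the Gaussian scheme, and the toy `areas`/`prices` values are all illustrative assumptions, not the authors' procedure or data.

```python
import random
import statistics


def inject_noise(values, noise_level, rng):
    """Perturb a fraction `noise_level` of `values` (hypothetical scheme).

    Each selected value gets additive Gaussian noise whose standard
    deviation equals the population standard deviation of the column.
    """
    sigma = statistics.pstdev(values)
    n_noisy = round(noise_level * len(values))
    noisy_idx = set(rng.sample(range(len(values)), n_noisy))
    return [
        v + rng.gauss(0.0, sigma) if i in noisy_idx else v
        for i, v in enumerate(values)
    ]


def make_noisy_dataset(attribute, output, noise_level, target, seed=0):
    """Inject noise into the attribute, the output, or both.

    `target` is one of "attribute", "output", "both", mirroring the three
    settings named in the abstract.
    """
    if target not in ("attribute", "output", "both"):
        raise ValueError("target must be 'attribute', 'output', or 'both'")
    rng = random.Random(seed)
    noisy_attr = list(attribute)
    noisy_out = list(output)
    if target in ("attribute", "both"):
        noisy_attr = inject_noise(attribute, noise_level, rng)
    if target in ("output", "both"):
        noisy_out = inject_noise(output, noise_level, rng)
    return noisy_attr, noisy_out


# Invented toy data: usable area (m^2) and price (arbitrary units).
areas = [50.0, 62.5, 71.0, 48.0, 90.0, 55.5, 67.0, 80.0, 45.0, 73.5]
prices = [210.0, 255.0, 300.0, 198.0, 390.0, 230.0, 280.0, 340.0, 185.0, 310.0]

for level in (0.0, 0.1, 0.2):
    _, noisy_prices = make_noisy_dataset(areas, prices, level, "output")
    changed = sum(1 for a, b in zip(prices, noisy_prices) if a != b)
    print(f"noise level {level:.0%}: {changed} of {len(prices)} prices perturbed")
```

In an experiment of the kind the abstract describes, each ensemble method would then be trained on the noisy copy and its error compared with the error obtained on the clean data.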
Similar resources
Evaluation of liquefaction potential based on CPT results using C4.5 decision tree
The prediction of liquefaction potential of soil due to an earthquake is an essential task in Civil Engineering. The decision tree is a tree structure consisting of internal and terminal nodes which process the data to ultimately yield a classification. C4.5 is a known algorithm widely used to design decision trees. In this algorithm, a pruning process is carried out to solve the problem of the...
The Investigation of Deep Data Representations Based on Decision Tree Ensembles for Classification Problems
A classification method based on deep representation of input data and ensembles of decision trees is introduced and evaluated solving the problem of vehicle classification and image classification with large number of categories.
On Oblique Random Forests
In his original paper on random forests, Breiman proposed two different decision tree ensembles: one generated from “orthogonal” trees with thresholds on individual features in every split, and one from “oblique” trees separating the feature space by randomly oriented hyperplanes. In spite of a rising interest in the random forest framework, however, ensembles built from orthogonal tr...
THE VALUATION OF PATENTS: A review of patent valuation methods with consideration of option based methods and the potential for further research
Intellectual Property Rights (IPRs) are viewed as being of increasing importance in many fields of business. However, one potential hindrance to their being considered of significant value is the lack of appreciation of practical methods of valuing them, particularly early in their life under conditions of uncertainty about their future prospects. Lack of practical valuation methods under such ...
The Generalization Paradox of Ensembles
Ensemble models—built by methods such as bagging, boosting, and Bayesian model averaging—appear dauntingly complex, yet tend to strongly outperform their component models on new data. Doesn’t this violate “Occam’s razor”—the widespread belief that “the simpler of competing alternatives is preferred”? We argue no: if complexity is measured by function rather than form—for example, according to g...